A Multi-Resolution Block Storage Model for Database Design
Abstract
We propose a new storage model called MBSM (Multiresolution Block Storage Model) for laying out tables on disks. MBSM is intended to speed up operations such as scans that are typical of data warehouse workloads. Disk blocks are grouped into “super-blocks,” with a single record stored in a partitioned fashion across the blocks of a super-block. The intention is that a scan operation that needs to consult only a small number of attributes can access just those blocks of each super-block that contain the desired attributes. To achieve good performance given the physical characteristics of modern disks, we organize super-blocks on the disk into fixed-size “mega-blocks.” Within a mega-block, blocks of the same type (from various super-blocks) are stored contiguously. We describe the changes needed in a conventional database system to manage tables using such a disk organization. We demonstrate experimentally that MBSM outperforms competing approaches such as NSM (N-ary Storage Model), DSM (Decomposition Storage Model), and PAX (Partition Attributes Across) for I/O-bound decision-support workloads consisting of scans in which not all attributes are required. This improved performance comes at the expense of single-record insert and delete performance; we quantify the trade-offs involved. Unlike in DSM, the cost of reconstructing a record from its partitions is small. MBSM stores attributes in a vertically partitioned manner similar to PAX, and thus shares PAX’s good CPU cache behavior. We describe methods for mapping attributes to blocks within super-blocks in order to optimize overall performance, and show how to tune the super-block and mega-block sizes.
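The layout described in the abstract can be illustrated with a minimal sketch. It is not taken from the paper: the class `Layout`, its fields, and the address formula are hypothetical, and they assume that every super-block has the same number of blocks, that each mega-block holds a fixed number of super-blocks, and that mega-blocks are laid out back to back on disk. Under those assumptions, the sketch shows why a scan that needs only some block types reads a few long contiguous runs rather than many scattered blocks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layout:
    """Hypothetical MBSM-style layout (illustrative, not the paper's code)."""
    blocks_per_super: int  # assumed: blocks (types) in each super-block
    supers_per_mega: int   # assumed: super-blocks in each mega-block

    def block_address(self, super_block: int, block_type: int) -> int:
        """Disk block number of one block of one super-block."""
        if not 0 <= block_type < self.blocks_per_super:
            raise ValueError("block_type out of range")
        mega, slot = divmod(super_block, self.supers_per_mega)
        mega_size = self.blocks_per_super * self.supers_per_mega
        # Inside a mega-block, all blocks of the same type sit together.
        return mega * mega_size + block_type * self.supers_per_mega + slot

    def scan_runs(self, num_supers: int, needed_types: list[int]) -> list[tuple[int, int]]:
        """Contiguous (start, length) runs a scan must read."""
        runs: list[tuple[int, int]] = []
        m = self.supers_per_mega
        for first in range(0, num_supers, m):
            count = min(m, num_supers - first)  # last mega-block may be partial
            for t in sorted(set(needed_types)):
                start = self.block_address(first, t)
                if runs and runs[-1][0] + runs[-1][1] == start:
                    # Physically adjacent to the previous run: extend it.
                    runs[-1] = (runs[-1][0], runs[-1][1] + count)
                else:
                    runs.append((start, count))
        return runs


layout = Layout(blocks_per_super=3, supers_per_mega=4)
# A scan over 8 super-blocks that needs only block type 0
# reads one run of 4 blocks per mega-block.
print(layout.scan_runs(8, [0]))
```

With these hypothetical sizes, a scan touching one of three block types reads a third of the blocks, and each read is a sequential run of `supers_per_mega` blocks. Fetching all the blocks of one super-block, as a single-record insert or delete would, touches blocks spread across the mega-block, which is consistent with the trade-off the abstract mentions.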
Publication date: 2003